Causal discovery in the geosciences - Using synthetic data to learn how to interpret results
Abstract
Causal discovery algorithms based on probabilistic graphical models have recently emerged in geoscience applications for the identification and visualization of dynamical processes. The key idea is to learn the structure of a graphical model from observed spatio-temporal data, thereby finding pathways of interaction in the observed physical system. Studying those pathways allows geoscientists to learn subtle details about the underlying dynamical mechanisms governing our planet. Initial studies applying this approach to real-world atmospheric data have shown great potential for scientific discovery. However, no ground truth was available in these initial studies, so the resulting graphs were evaluated only by whether a domain expert judged them physically plausible. This lack of ground truth is a typical problem when using causal discovery in the geosciences. Furthermore, while most of the connections found by this method match domain knowledge, we encountered one type of connection for which no explanation was found. To address both issues, we developed a simulation framework that generates synthetic data of typical atmospheric processes (advection and diffusion). Applying the causal discovery algorithm to the synthetic data allowed us (1) to develop a better understanding of how these physical processes appear in the resulting connectivity graphs, and thus how to better interpret such graphs when they are obtained from real-world data; and (2) to solve the mystery of the previously unexplained connections.
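The abstract does not describe the simulation framework in detail, so the following is only a minimal sketch of how synthetic advection-diffusion data with a known ground truth might be generated. Everything in it is an assumption made for illustration: the one-dimensional periodic grid, the explicit upwind scheme, the function names `simulate_advection_diffusion` and `center_of_mass`, and all parameter values. It is not the authors' framework.

```python
def simulate_advection_diffusion(n_cells=60, n_steps=20, courant=0.5,
                                 diffusion=0.1, source_cell=20):
    """Advect and diffuse a unit tracer pulse on a 1D periodic grid.

    Illustrative assumptions (not taken from the paper):
      - flow is uniform and directed toward increasing cell index
      - courant   = velocity * dt / dx       (advection strength per step)
      - diffusion = diffusivity * dt / dx**2 (diffusion strength per step)
    Returns a list of n_steps + 1 snapshots, starting with the initial state.
    """
    # Stability condition for this explicit upwind scheme.
    if courant < 0 or diffusion < 0 or courant + 2 * diffusion > 1:
        raise ValueError("need courant >= 0, diffusion >= 0, "
                         "courant + 2 * diffusion <= 1")

    field = [0.0] * n_cells
    field[source_cell] = 1.0  # unit pulse released at one grid cell
    history = [field]

    for _ in range(n_steps):
        nxt = []
        for i in range(n_cells):
            upstream = field[i - 1]                # periodic: index -1 wraps
            downstream = field[(i + 1) % n_cells]  # periodic wrap on the right
            advection = -courant * (field[i] - upstream)
            spreading = diffusion * (upstream - 2.0 * field[i] + downstream)
            nxt.append(field[i] + advection + spreading)
        field = nxt
        history.append(field)
    return history


def center_of_mass(field):
    """Mass-weighted mean cell index of a snapshot."""
    return sum(i * v for i, v in enumerate(field)) / sum(field)


if __name__ == "__main__":
    snapshots = simulate_advection_diffusion()
    print("total mass:", sum(snapshots[-1]))
    print("center of mass:", center_of_mass(snapshots[-1]))
```

Because the flow direction and speed are prescribed, the expected direction and lag of information transfer between grid cells are known in advance. The time series at each cell could then be passed to a structure-learning algorithm and the recovered graph compared against that known flow.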
Journal:
- Computers & Geosciences
Volume 99
Publication date: 2017